Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CodedInputReader and CodedOutputWriter #5888

Closed

Conversation

JamesNK
Copy link
Contributor

@JamesNK JamesNK commented Mar 13, 2019

This is very WIP/rough. I had to create a new ISpanMessage<T> interface, and some areas do not have full support for it yet (they are throwing NotSupportException right now).

Nothing has been done on the code gen side. The way the reader and writer are used compared to the streams is almost identitical so code gen changes should be minimal.

Perf numbers:

WriteToByteArray and ParseFromByteArray both allocate the byte array. This matches how serialization from gRPC works today.

WriteToBufferWriter writes to an IBufferWriter<byte> and doesn't allocate at all. For a small message CodedOutputWriter is slower than CodedOutputStream. It is faster for medium/large messages. Still some tuning to be done.

ParseFromReadOnlySequence reads from a sequence. Its allocations are related to creating the message and whatever strings exist on the message. Using CodedInputReader is consistently faster than CodedInputStream.

|                    Method |     Mean |     Error |    StdDev |   Median | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
|-------------------------- |---------:|----------:|----------:|---------:|------------:|------------:|------------:|--------------------:|
|          WriteToByteArray | 377.5 ns | 11.210 ns | 32.876 ns | 370.3 ns |      0.0682 |           - |           - |               288 B |
|        ParseFromByteArray | 385.2 ns |  8.280 ns | 23.624 ns | 377.5 ns |      0.2666 |           - |           - |              1120 B |
|       WriteToBufferWriter | 294.0 ns |  3.658 ns |  3.421 ns | 293.6 ns |           - |           - |           - |                   - |
| ParseFromReadOnlySequence | 338.6 ns |  6.696 ns | 10.621 ns | 336.0 ns |      0.1903 |           - |           - |               800 B |

@jskeet @davidfowl @JunTaoLuo @jtattermusch

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

global.json Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/Google.Protobuf.csproj Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedOutputWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/BufferWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedOutputWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedOutputWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedOutputWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedOutputWriter.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedInputReader.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedInputReader.cs Outdated Show resolved Hide resolved
csharp/src/Google.Protobuf/CodedInputReader.cs Outdated Show resolved Hide resolved
@anandolee anandolee added the c# label Mar 18, 2019
@anandolee anandolee self-assigned this Mar 18, 2019
@jtattermusch
Copy link
Contributor

A prerequisite PR #5835 has been merged, rebase/merge the current master?

@JamesNK JamesNK force-pushed the jamesnk/protobuf-reader-wip branch from fc53476 to e09705a Compare July 8, 2019 03:02
@JamesNK JamesNK changed the title [WIP] Add CodedInputReader and CodedOutputWriter Add CodedInputReader and CodedOutputWriter Jul 8, 2019
@JamesNK JamesNK marked this pull request as ready for review July 8, 2019 05:10
@JamesNK
Copy link
Contributor Author

JamesNK commented Jul 8, 2019

I've rebased on master and addressed most PR feedback.

This PR isn't finished, but before figuring out every issue that needs to be solved I'd like some feedback from the protocol buffers team. Is this PR on the right track? What do you want to see before the PR can be merged?

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@JamesNK
Copy link
Contributor Author

JamesNK commented Jan 24, 2020

@jtattermusch Tests now pass.

  1. I've made the reader always use a span buffer on the fast path (all the content is in the current span). The slow path will always check the current limit. This is similar to what CodedInputStream does.
  2. While making these changes I changed the reader to use Int32 as the length/limit instead of Int64. This is consistent with CodedInputStream, but it limits the reader to 4GB messages. Good/bad? This can be changed back if we want.

Take a look 🙏

@@ -86,7 +86,7 @@ internal CodedInputReader(ReadOnlySequence<byte> input, int recursionLimit)
this.lastTag = 0;
this.recursionDepth = 0;
this.recursionLimit = recursionLimit;
this.currentLimit = long.MaxValue;
this.currentLimit = (int)this.reader.Length;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not slice the input upfront?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slice what up front?

Setting the current limit here lets us use this value as "all the remaining content". It avoids an additional check to long.MaxValue when testing the current limit.

@JamesNK
Copy link
Contributor Author

JamesNK commented Jan 24, 2020

Parsing benchmarks are the same, maybe even slightly faster. Perf gain might be from the Int64 -> Int32 length/limit change.

|                    Method |        Configuration |           Mean |         Error |        StdDev |         Median |    Gen 0 |    Gen 1 |    Gen 2 |  Allocated |
|-------------------------- |--------------------- |---------------:|--------------:|--------------:|---------------:|---------:|---------:|---------:|-----------:|
|    ParseFromNewByteString | GoogleMessage1 Empty |       170.0 ns |       3.44 ns |       5.05 ns |       169.9 ns |   0.0842 |        - |        - |      352 B |
| ParseFromReadOnlySequence | GoogleMessage1 Empty |       166.3 ns |       2.82 ns |       2.50 ns |       165.5 ns |   0.0572 |        - |        - |      240 B |
|    ParseFromNewByteString | GoogleMessage1 Large | 9,959,132.3 ns | 330,097.62 ns | 962,910.25 ns | 9,597,186.7 ns | 484.3750 | 453.1250 | 453.1250 | 14157012 B |
| ParseFromReadOnlySequence | GoogleMessage1 Large | 9,718,207.7 ns | 227,172.97 ns | 669,824.64 ns | 9,773,831.2 ns | 359.3750 | 328.1250 | 328.1250 |  9439786 B |
|    ParseFromNewByteString | Googl(...)edium [21] |    17,372.4 ns |     217.11 ns |     203.09 ns |    17,359.0 ns |   6.7444 |   0.0610 |        - |    28296 B |
| ParseFromReadOnlySequence | Googl(...)edium [21] |    14,712.3 ns |     174.29 ns |     145.54 ns |    14,716.4 ns |   4.5166 |        - |        - |    18952 B |
|    ParseFromNewByteString | GoogleMessage1 Small |       480.6 ns |       5.64 ns |       4.71 ns |       480.4 ns |   0.1469 |        - |        - |      616 B |
| ParseFromReadOnlySequence | GoogleMessage1 Small |       490.7 ns |       9.79 ns |      16.88 ns |       484.5 ns |   0.1144 |        - |        - |      480 B |

@jtattermusch
Copy link
Contributor

@JamesNK
Copy link
Contributor Author

JamesNK commented Jan 24, 2020

Fixed by removing a ReachedLimit check at the end of ReadTag. I'm not sure what its purpose is. No unit tests use it.

@jtattermusch
Copy link
Contributor

FTR, these are the performance results I got at commit bdf9f1a.
As said before, the result, the SerializationBenchmark scenarios are a bit hard to interpret (and it's not clear why there is GoogleMessage1 scenarios vs the google_message1_proto3.pb) aren't necessarily super representative (odd choice of fields that get serialized), but for now they are good enough to get some idea about how the performance of the buffer vs non-buffer serialization comparison - we can put some more effort into choosing the right set of microbenchmarks in the future.

BenchmarkDotNet=v0.12.0, OS=debian rodete
Intel Xeon CPU 2.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100
[Host] : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT
DefaultJob : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
ParseWrapperFieldsFromByteArray 824.9 ns 16.22 ns 33.86 ns 0.0792 - - 1864 B
ParseWrapperFieldsFromNewByteArray 958.1 ns 19.10 ns 48.95 ns 0.0839 - - 1984 B
ParseWrapperFieldsFromReadOnlySequence 1,059.5 ns 21.01 ns 40.48 ns 0.0744 - - 1776 B
ParsePrimitiveFieldsFromByteArray 484.2 ns 9.80 ns 26.66 ns 0.0429 - - 1024 B
ParsePrimitiveFieldsFromNewByteArray 607.9 ns 12.21 ns 34.03 ns 0.0477 - - 1128 B
ParsePrimitiveFieldsFromReadOnlySequence 595.3 ns 11.92 ns 19.91 ns 0.0391 - - 936 B

BenchmarkDotNet=v0.12.0, OS=debian rodete
Intel Xeon CPU 2.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100
[Host] : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT
DefaultJob : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT

Method Configuration Mean Error StdDev Median Gen 0 Gen 1 Gen 2 Allocated
WriteToStream GoogleMessage1 Empty 585.96 ns 17.070 ns 49.250 ns 582.43 ns 0.1764 0.0019 - 4168 B
WriteToBufferWriter GoogleMessage1 Empty 67.92 ns 1.343 ns 1.437 ns 68.09 ns - - - -
WriteToPipeWriter GoogleMessage1 Empty 564.02 ns 11.339 ns 22.115 ns 574.54 ns 0.0029 - - 88 B
ToByteArray GoogleMessage1 Empty 138.83 ns 2.734 ns 5.068 ns 140.89 ns 0.0029 - - 72 B
ParseFromByteString GoogleMessage1 Empty 177.60 ns 3.574 ns 5.458 ns 178.16 ns 0.0138 - - 328 B
ParseFromNewByteString GoogleMessage1 Empty 263.82 ns 5.347 ns 9.642 ns 264.32 ns 0.0148 - - 352 B
ParseFromStream GoogleMessage1 Empty 738.28 ns 21.937 ns 64.338 ns 727.73 ns 0.1888 0.0029 - 4448 B
ParseFromReadOnlySequence GoogleMessage1 Empty 245.81 ns 4.884 ns 5.624 ns 246.15 ns 0.0100 - - 240 B
WriteToStream GoogleMessage1 Large 11,719,441.04 ns 234,300.441 ns 230,114.362 ns 11,629,133.75 ns - - - 528489 B
WriteToBufferWriter GoogleMessage1 Large 4,592,758.73 ns 89,795.642 ns 106,895.324 ns 4,583,496.78 ns - - - -
WriteToPipeWriter GoogleMessage1 Large 6,565,327.56 ns 130,412.649 ns 251,260.900 ns 6,519,200.16 ns 304.6875 164.0625 23.4375 4584278 B
ToByteArray GoogleMessage1 Large 14,947,347.86 ns 292,638.836 ns 480,814.000 ns 14,972,486.84 ns 218.7500 218.7500 62.5000 4718946 B
ParseFromByteString GoogleMessage1 Large 13,140,501.88 ns 254,355.581 ns 380,707.457 ns 12,965,723.47 ns 375.0000 343.7500 187.5000 9438142 B
ParseFromNewByteString GoogleMessage1 Large 16,797,887.32 ns 354,764.314 ns 1,012,162.551 ns 16,676,895.09 ns 687.5000 562.5000 406.2500 14156661 B
ParseFromStream GoogleMessage1 Large 15,055,567.97 ns 324,534.238 ns 956,896.523 ns 15,133,358.80 ns 562.5000 437.5000 281.2500 10491921 B
ParseFromReadOnlySequence GoogleMessage1 Large 19,694,659.68 ns 391,373.734 ns 996,171.112 ns 19,679,444.11 ns 531.2500 468.7500 281.2500 9438683 B
WriteToStream Googl(...)edium [21] 23,682.87 ns 446.180 ns 458.194 ns 23,642.08 ns 0.1526 - - 4168 B
WriteToBufferWriter Googl(...)edium [21] 9,107.29 ns 180.560 ns 301.675 ns 9,250.01 ns - - - -
WriteToPipeWriter Googl(...)edium [21] 10,558.40 ns 100.017 ns 78.087 ns 10,557.72 ns - - - 264 B
ToByteArray Googl(...)edium [21] 24,258.73 ns 287.744 ns 255.078 ns 24,225.96 ns 0.3662 - - 9304 B
ParseFromByteString Googl(...)edium [21] 22,276.26 ns 340.168 ns 318.194 ns 22,225.52 ns 0.7935 0.0305 - 19040 B
ParseFromNewByteString Googl(...)edium [21] 24,795.56 ns 491.308 ns 735.366 ns 24,708.33 ns 1.1902 0.0305 - 28296 B
ParseFromStream Googl(...)edium [21] 23,531.90 ns 546.954 ns 748.677 ns 23,280.62 ns 0.9766 0.0305 - 23160 B
ParseFromReadOnlySequence Googl(...)edium [21] 20,997.95 ns 359.921 ns 319.061 ns 20,971.58 ns 0.7935 0.0305 - 18952 B
WriteToStream GoogleMessage1 Small 987.37 ns 27.209 ns 80.227 ns 986.19 ns 0.1755 0.0019 - 4168 B
WriteToBufferWriter GoogleMessage1 Small 204.85 ns 4.088 ns 4.708 ns 202.08 ns - - - -
WriteToPipeWriter GoogleMessage1 Small 757.89 ns 14.577 ns 19.953 ns 753.86 ns 0.0029 - - 88 B
ToByteArray GoogleMessage1 Small 374.36 ns 3.559 ns 2.778 ns 373.72 ns 0.0038 - - 96 B
ParseFromByteString GoogleMessage1 Small 560.42 ns 11.545 ns 10.234 ns 559.49 ns 0.0238 - - 568 B
ParseFromNewByteString GoogleMessage1 Small 605.40 ns 12.066 ns 25.451 ns 611.67 ns 0.0257 - - 616 B
ParseFromStream GoogleMessage1 Small 1,299.70 ns 36.213 ns 106.776 ns 1,306.19 ns 0.1984 0.0019 - 4688 B
ParseFromReadOnlySequence GoogleMessage1 Small 611.78 ns 12.147 ns 19.959 ns 618.58 ns 0.0200 - - 480 B
WriteToStream googl(...)roto3 [22] 1,447.22 ns 28.820 ns 79.378 ns 1,440.36 ns 0.1755 0.0019 - 4168 B
WriteToBufferWriter googl(...)roto3 [22] 621.40 ns 12.268 ns 25.336 ns 627.89 ns - - - -
WriteToPipeWriter googl(...)roto3 [22] 1,173.86 ns 23.198 ns 26.715 ns 1,182.99 ns 0.0019 - - 88 B
ToByteArray googl(...)roto3 [22] 996.30 ns 19.914 ns 42.867 ns 1,009.79 ns 0.0114 - - 296 B
ParseFromByteString googl(...)roto3 [22] 1,105.30 ns 21.938 ns 26.115 ns 1,104.02 ns 0.0362 - - 896 B
ParseFromNewByteString googl(...)roto3 [22] 1,219.48 ns 24.263 ns 48.455 ns 1,226.64 ns 0.0477 - - 1152 B
ParseFromStream googl(...)roto3 [22] 1,919.07 ns 41.945 ns 123.019 ns 1,934.25 ns 0.2098 - - 5016 B
ParseFromReadOnlySequence googl(...)roto3 [22] 1,244.52 ns 21.112 ns 18.715 ns 1,245.06 ns 0.0343 - - 808 B

Copy link
Contributor

@jtattermusch jtattermusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I'm done with reviewing this mega PR, I think now things are in a good enough shape for this to get merged.
Thanks for bearing with me and thanks for addressing all the comments.

LGTM.

The next step is for someone from the protobuf team to review the overall design and the high-level technical approach (let's consider this PR approved from the C# perspective).

@haberman this is now ready for the final review. The overall design is documented here:
#6874

@jtattermusch
Copy link
Contributor

This has been superseded by #7351 (we still need to add support for serialization).

@davidfowl
Copy link

What’s the API usage

@JamesNK
Copy link
Contributor Author

JamesNK commented May 11, 2020

var reply = HelloReply.Parser.ParseFrom(data);

Data is a ReadOnlySequence<byte>.

@jtattermusch
Copy link
Contributor

What’s the API usage

Basically the entry point is message.Parser.ParseFrom(ReadOnlySequence data)

@davidfowl
Copy link

Ref or in?

@JamesNK
Copy link
Contributor Author

JamesNK commented May 11, 2020

Neither - https://github.com/protocolbuffers/protobuf/pull/7351/files#diff-1518fe4210577434130a7ae84f789c4eR124

ReadOnlySequence<byte> is big enough that it benefits from not being copied, right?

Feel free to review the PR and add comments if you have any improvements #7351

@davidfowl
Copy link

ReadOnlySequence is big enough that it benefits from not being copied, right?

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.